kvs: support configuration of max operations count #6581
Re-pushed, removing all fence-related parts of this PR, since KVS fence support is now gone.
Re-pushed, removing all the fence stuff that is no longer relevant b/c KVS fence support was removed.
From a small set of transaction stats, the max number of ops in a transaction is 398, which means my default cap of 64K seems more than fine. I'll still monitor stats until release time, but I'm removing WIP for now so this can be reviewed.
Saw this today: a max of 11K ... perhaps raising the default from 64K to 128K would still be fine and a bit safer.
Before we do this, we should probably investigate why we are seeing these high op-count commits. I think stdout events are batched by this code, which triggers the commit of a batch of events based on a timer. Maybe it should also have a high water mark on the operation count and/or the cumulative size of the events.

https://github.com/flux-framework/flux-core/blob/master/src/common/libeventlog/eventlogger.c

If we can make sure flux-core code is well behaved, then it makes sense to me to impose a limit to avoid regressions and bad behavior by framework projects, but I should think we could set it much lower than 128K. We should also provide some way for API users to know what the limit is. For example, the above code could then set its high water mark accordingly. Maybe we could even just make it a constant in

Note that this PR, in the title and a few other places, uses "transaction" where "operation" is meant. IMHO the title should be "kvs: limit the number of operations per commit".
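The high water mark idea above can be sketched roughly as follows. This is only an illustration, not code from eventlogger.c: `struct event_batch`, `batch_append`, and `batch_reset` are hypothetical names, and the assumption is that the caller commits the batch either when the existing timer fires or when `batch_append` reports that a mark has been reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical batch accumulator; names do not come from eventlogger.c. */
struct event_batch {
    size_t op_count;    /* operations queued so far */
    size_t byte_count;  /* cumulative size of queued events */
    size_t max_ops;     /* high water mark on operation count */
    size_t max_bytes;   /* high water mark on cumulative size */
};

/* Queue one event of the given size.  Returns true if the caller should
 * commit the batch now rather than wait for the timer.
 */
bool batch_append (struct event_batch *b, size_t event_size)
{
    assert (b != NULL);
    b->op_count++;
    b->byte_count += event_size;
    return b->op_count >= b->max_ops || b->byte_count >= b->max_bytes;
}

/* Reset the counters after the batch has been committed. */
void batch_reset (struct event_batch *b)
{
    assert (b != NULL);
    b->op_count = 0;
    b->byte_count = 0;
}
```

If the KVS limit were exposed to API users, `max_ops` could presumably be set at or below it, so that a batch is never rejected for its operation count.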
Problem: A KVS denial of service is possible because there is no maximum on the number of operations a user can submit in a KVS transaction. For example, a KVS transaction with billions of KVS entries would lead to a severe degradation in KVS performance. Support a new KVS configuration "transaction-max-ops" that rejects any KVS transaction whose operation count exceeds a maximum. The default maximum is 131072. Fixes flux-framework#6572
Problem: The new kvs transaction-max-ops configuration option is not documented. Add documentation to flux-config-kvs(5).
Problem: There is no coverage for the new kvs transaction-max-ops configuration. Add coverage in t1005-kvs-security.t.
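For illustration, the new option might be set like this in the instance's TOML configuration. The key name and default value are taken from the commit messages above; placing it under a `[kvs]` table is an assumption based on how flux-config-kvs(5) documents other kvs keys, so check that man page for the authoritative form.

```toml
# Assumed placement under the [kvs] table; see flux-config-kvs(5).
[kvs]
transaction-max-ops = 131072
```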
Codecov Report
Additional details and impacted files:
@@           Coverage Diff           @@
##           master    #6581   +/-   ##
=======================================
  Coverage   83.83%   83.84%           
=======================================
  Files         539      539           
  Lines       90283    90305   +22     
=======================================
+ Hits        75693    75713   +20     
- Misses      14590    14592    +2     
Good point. I'll open an issue to investigate this as well. However, this PR was initially developed to defend against a denial of service, so I think it is worthwhile to have the max independent of that investigation. Right now, a rogue user (or misbehaving code) could create a KVS transaction with like a bajillion operations in it.
Per discussion in #6125, denial-of-service attacks could be made against the KVS with very large KVS transactions.
Support two configurations for capping the number of operations submitted by users: one for each individual transaction made by a caller, and one for the combined total of operations from a fence.
For the time being, I made the default 64K for the transaction cap and 1M for the fence cap.
I made this WIP only b/c those defaults may be tweaked depending on what stats we get from the prior PR #6556. I would like to merge only after we gather a bit of data, although I'd be quite shocked if we have to adjust the defaults. Edit: Alternatively, if we'd like to just get the code in, we could default the max to LLONG_MAX and lower the default at a later time.
The only other thought is that I decided to return the errno E2BIG when a max cap is exceeded. It's possible there is a superior errno for this; I picked it b/c I thought "ehhh that's not bad".
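A minimal sketch of the check described here, assuming the cap is held in a simple config struct. `struct kvs_limits`, `check_txn_op_count`, and `TXN_MAX_OPS_DEFAULT` are hypothetical names for illustration, not the identifiers used in the PR; the 131072 value is the default stated in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical default, taken from the commit message in this PR. */
#define TXN_MAX_OPS_DEFAULT 131072

/* Hypothetical config holder; not the actual flux-core struct. */
struct kvs_limits {
    size_t transaction_max_ops;
};

/* Return 0 if a transaction with op_count operations is acceptable,
 * or -1 with errno set to E2BIG if it exceeds the configured cap.
 */
int check_txn_op_count (const struct kvs_limits *limits, size_t op_count)
{
    assert (limits != NULL);
    if (op_count > limits->transaction_max_ops) {
        errno = E2BIG;
        return -1;
    }
    return 0;
}
```

In this sketch a transaction with exactly the maximum number of operations is accepted and only one above the cap is rejected; whether the real boundary is inclusive is not stated in the description.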